Cross-Language Information Retrieval using Compound Word Translation

Authors

  • Atsushi Fujii
  • Tetsuya Ishikawa
Abstract

This paper proposes a cross-language information retrieval (CLIR) system queried with technical compound words. Our system first translates queries into the target language. Instead of exhaustively enumerating new compound words in a bilingual dictionary, we produce a dictionary for base words, and compute the plausibility score for each combination of base word translations. Then, the most plausible combination is used for the subsequent retrieval process. Experimental results showed that our system outperforms baseline CLIR systems. We also propose an interaction strategy to facilitate user feedback.
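The translation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the plausibility score is defined, so the sketch assumes one simple possibility, the summed log-probability of adjacent target-word pairs. The dictionary entries, the bigram table, the smoothing constant, and the name `best_translation` are all hypothetical.

```python
import math
from itertools import product

# Hypothetical base-word dictionary: each source base word maps to its
# candidate translations in the target language (toy data, not from the paper).
BASE_DICT = {
    "src_1": ["information", "data"],
    "src_2": ["retrieval", "search"],
}

# Hypothetical target-language bigram probabilities (toy data).
BIGRAM = {
    ("information", "retrieval"): 0.6,
    ("information", "search"): 0.2,
    ("data", "retrieval"): 0.3,
}

UNSEEN = 1e-6  # assumed floor probability for word pairs missing from BIGRAM


def best_translation(base_words, base_dict, bigram):
    """Return the most plausible combination of base-word translations.

    Every combination of candidate translations is enumerated, and each one
    is scored by the sum of log bigram probabilities over adjacent words.
    This scoring function is an assumption made for illustration.
    """
    candidates = [base_dict[word] for word in base_words]
    best_combo, best_score = None, -math.inf
    for combo in product(*candidates):
        score = sum(
            math.log(bigram.get(pair, UNSEEN))
            for pair in zip(combo, combo[1:])
        )
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score


combo, score = best_translation(["src_1", "src_2"], BASE_DICT, BIGRAM)
print(" ".join(combo))
```

Exhaustive enumeration grows multiplicatively with the number of base words and candidates, which is tolerable for short compounds; a longer compound would call for a dynamic-programming search instead.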

Similar resources

Japanese/English Cross-Language Information Retrieval: Exploration of Query Translation and Transliteration

Cross-language information retrieval (CLIR), where queries and documents are in different languages, has of late become one of the major topics within the information retrieval community. This paper proposes a Japanese/English CLIR system, where we combine query translation and retrieval modules. We currently target the retrieval of technical documents, and therefore the performance of our sy...

Using Transliteration of Proper Names from Arabic to Latin Script to Improve English-Arabic Word Alignment

Bilingual lexicons of proper names play a vital role in machine translation and cross-language information retrieval. Word alignment approaches are generally used to construct bilingual lexicons automatically from parallel corpora. Aligning proper names is a particularly difficult task when the source and target languages of the parallel corpus do not share the same written script. We present in ...

A Method using Language Grid and Concept Base for Japanese-English Cross-language Information Retrieval

This paper describes query translation using language resources and a concept base method for Cross-language Information Retrieval (CLIR). In the proposed method, queries are translated by multiple machine translation systems on the Language Grid. The queries are then expanded by using a bilingual dictionary to translate compound words or word phrases. In addition, documents related to the tran...

Disambiguation of Compound Noun Translations Extracted from Bilingual Comparable Corpora

Bilingual machine readable dictionaries are important and indispensable information resources for cross-language information retrieval, machine translation, and so on. In this paper, we describe a bilingual dictionary acquisition system which extracts translations from non-parallel but comparable corpora of a specific academic domain and disambiguates the extracted translations. We also experim...

Cross-Lingual Information Retrieval Problems: Methods and findings for three language pairs

In this paper we will discuss dictionary-based cross-language information retrieval (CLIR) methods, and report recent findings and problems. We will consider three language pairs for CLIR: Finnish to English, English to Finnish, Swedish to English. We show that Finnish and Swedish have special features, e.g., the frequency of homography and a high frequency of compound words that affect retriev...

Publication date: 1999